Skip to content

Nominal values attribute fix#508

Closed
maxcopeland wants to merge 2 commits intoopenml:developfrom
maxcopeland:nominal_vals_attrib_fix
Closed

Nominal values attribute fix#508
maxcopeland wants to merge 2 commits intoopenml:developfrom
maxcopeland:nominal_vals_attrib_fix

Conversation

@maxcopeland
Copy link
Copy Markdown

@maxcopeland maxcopeland commented Aug 28, 2018

Reference Issue

Fixes #507
Adds list of nominal values to be passed to OpenMLDataFeature constructor in setting the OpenMLDataset.feature attribute

What does this PR implement/fix? Explain your changes.

Creates nom_val variable to store list of nominal variables if feature is nominal, stores None otherwise, then nom_val is passed to OpenMLDataFeature constructor

How should this PR be tested?

Any other comments?

xmlfeature['oml:nominal_values'] contains list as string type (e.g. `"['A', 'B', 'C', 'D']"), thus the use of splitting, lambda

@codecov-io
Copy link
Copy Markdown

Codecov Report

Merging #508 into develop will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop    #508      +/-   ##
==========================================
+ Coverage    89.59%   89.6%   +0.01%     
==========================================
  Files           32      32              
  Lines         2815    2819       +4     
==========================================
+ Hits          2522    2526       +4     
  Misses         293     293
Impacted Files Coverage Δ
openml/datasets/dataset.py 78.18% <ø> (+0.32%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c08dd0f...eef091a. Read the comment docs.

# split string of nominal values into type list if feature is nominal
# otherwise passing none for nominal values
try:
nom_vals = [str(x[1:-1]) for x in xmlfeature['oml:nominal_values'][1:-1].split(',')]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a brief explanation of this statement, preferably with an example string that is parsed?

@mfeurer
Copy link
Copy Markdown
Collaborator

mfeurer commented Sep 11, 2018

Thanks a lot for tackling this @maxcopeland. Please excuse the long wait time, but I was busy for the last few days finishing up a paper.

In addition to my comment, could you please:

  • add a unit test
  • from my understanding, a comma is legal in the nominal value (as long as it's properly quoted) (@janvanrijn please confirm). Therefore, I think you need to parse this a bit differently, for example using the csv module or ast.literal_eval.

@janvanrijn is there a reason that this is passed as a python-list-like string and not as a proper XML list? I assume the second one would be easier to handle by the client APIs as the parsing is handled by XML libraries.

@janvanrijn
Copy link
Copy Markdown
Member

@mfeurer no particular reason, if you prefer an xml list we can discuss it next week at the Hackathon?

@mfeurer
Copy link
Copy Markdown
Collaborator

mfeurer commented Sep 12, 2018

I definitively prefer the list to be in XML, because we can then rely on standard libraries to do the parsing. If the list is stored like it is stored right now, this requires custom parsing on the client side, which we should definitively avoid.

@mfeurer mfeurer closed this Sep 12, 2018
@mfeurer mfeurer reopened this Sep 12, 2018
@maxcopeland
Copy link
Copy Markdown
Author

No problem. Should I rewrite the parsing to expect xml list in xmlfeature['oml:nominal_values']?

@mfeurer
Copy link
Copy Markdown
Collaborator

mfeurer commented Sep 13, 2018

Thanks for your offer. I'm afraid we first need to discuss how to expect such information from the server and then implement this. We will come back to you next week.

@maxcopeland
Copy link
Copy Markdown
Author

Okay, sounds good. Just let me know once there's a consensus and I'll make the necessary changes.

@mfeurer
Copy link
Copy Markdown
Collaborator

mfeurer commented Sep 20, 2018

@janvanrijn just finished implementing getting the nominal values as an XML list. Could you please try again @maxcopeland ?

@maxcopeland
Copy link
Copy Markdown
Author

@mfeurer Great. Sure-- I'll get started.

@mfeurer
Copy link
Copy Markdown
Collaborator

mfeurer commented Mar 18, 2019

Superseded by #644.

@mfeurer mfeurer closed this Mar 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nominal features values

4 participants